91 research outputs found

    Human Substantia Nigra Neurons Encode Unexpected Financial Rewards

    The brain's sensitivity to unexpected outcomes plays a fundamental role in an organism's ability to adapt and learn new behaviors. Emerging research suggests that midbrain dopaminergic neurons encode these unexpected outcomes. We used microelectrode recordings during deep brain stimulation surgery to study neuronal activity in the human substantia nigra (SN) while patients with Parkinson's disease engaged in a probabilistic learning task motivated by virtual financial rewards. Based on a model of the participants' expected reward, we divided trial outcomes into expected and unexpected gains and losses. SN neurons exhibited significantly higher firing rates after unexpected gains than unexpected losses. No such differences were observed after expected gains and losses. This result provides critical support for the hypothesized role of the SN in human reinforcement learning

    Neural effects of cannabinoid CB1 neutral antagonist tetrahydrocannabivarin (THCv) on food reward and aversion in healthy volunteers

    Disturbances in the regulation of reward and aversion in the brain may underlie disorders such as obesity and eating disorders. We previously showed that the cannabis receptor subtype (CB1) inverse agonist rimonabant, an antiobesity drug withdrawn due to depressogenic side effects, diminished neural reward responses yet increased aversive responses (Horder et al., 2010). Unlike rimonabant, tetrahydrocannabivarin is a neutral CB1 receptor antagonist (Pertwee, 2005) and may therefore produce different modulations of the neural reward system. We hypothesized that tetrahydrocannabivarin would, unlike rimonabant, leave intact neural reward responses but augment aversive responses. Methods: We used a within-subject, double-blind design. Twenty healthy volunteers received a single dose of tetrahydrocannabivarin (10mg) and placebo in randomized order on 2 separate occasions. We measured the neural response to rewarding (sight and/or flavor of chocolate) and aversive stimuli (picture of moldy strawberries and/or a less pleasant strawberry taste) using functional magnetic resonance imaging. Volunteers rated pleasantness, intensity, and wanting for each stimulus. Results: There were no significant differences between groups in subjective ratings. However, tetrahydrocannabivarin increased responses to chocolate stimuli in the midbrain, anterior cingulate cortex, caudate, and putamen. Tetrahydrocannabivarin also increased responses to aversive stimuli in the amygdala, insula, mid orbitofrontal cortex, caudate, and putamen. Conclusions: Our findings are the first to show that treatment with the CB1 neutral antagonist tetrahydrocannabivarin increases neural responding to rewarding and aversive stimuli. This effect profile suggests therapeutic activity in obesity, perhaps with a lowered risk of depressive side effects. Keywords: reward, THCv, obesity, fMRI, cannabinoi

    Role of Dopamine D2 Receptors in Human Reinforcement Learning

    Influential neurocomputational models emphasize dopamine (DA) as an electrophysiological and neurochemical correlate of reinforcement learning. However, evidence of a specific causal role of DA receptors in learning has been less forthcoming, especially in humans. Here we combine, in a between-subjects design, administration of a high dose of the selective DA D2/3-receptor antagonist sulpiride with genetic analysis of the DA D2 receptor in a behavioral study of reinforcement learning in a sample of 78 healthy male volunteers. In contrast to predictions of prevailing models emphasizing DA's pivotal role in learning via prediction errors, we found that sulpiride did not disrupt learning, but rather induced profound impairments in choice performance. The disruption was selective for stimuli indicating reward, while loss avoidance performance was unaffected. Effects were driven by volunteers with higher serum levels of the drug, and in those with genetically-determined lower density of striatal DA D2 receptors. This is the clearest demonstration to date for a causal modulatory role of the DA D2 receptor in choice performance that might be distinct from learning. Our findings challenge current reward prediction error models of reinforcement learning, and suggest that classical animal models emphasizing a role of postsynaptic DA D2 receptors in motivational aspects of reinforcement learning may apply to humans as well. Neuropsychopharmacology accepted article preview online, 09 April 2014; doi:10.1038/npp.2014.84

    Dopamine, reward learning, and active inference

    Temporal difference learning models propose that phasic dopamine signaling encodes reward prediction errors that drive learning. This is supported by studies where optogenetic stimulation of dopamine neurons can stand in lieu of actual reward. Nevertheless, a large body of data also shows that dopamine is not necessary for learning, and that dopamine depletion primarily affects task performance. We offer a resolution to this paradox based on a hypothesis that dopamine encodes the precision of beliefs about alternative actions, and thus controls the outcome-sensitivity of behavior. We extend an active inference scheme for solving Markov decision processes to include learning, and show that simulated dopamine dynamics strongly resemble those actually observed during instrumental conditioning. Furthermore, simulated dopamine depletion impairs performance but spares learning, while simulated excitation of dopamine neurons drives reward learning, through aberrant inference about outcome states. Our formal approach provides a novel and parsimonious reconciliation of apparently divergent experimental findings
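
    For orientation, the reward prediction error that temporal difference models attribute to phasic dopamine can be sketched in a few lines. This is a generic tabular TD(0) illustration, not the active inference scheme described in the abstract above. The function name `td_errors`, the three-step trial, and the values of `alpha` and `gamma` are assumptions chosen for illustration only.

```python
def td_errors(rewards, alpha=0.1, gamma=0.9, n_trials=200):
    """Tabular TD(0) on a fixed chain of states (hypothetical toy task).

    rewards[t] is the reward received on leaving state t; the state after
    the last one is terminal and has value 0.
    Returns the learned values and the per-trial prediction errors.
    """
    n_states = len(rewards)
    values = [0.0] * (n_states + 1)  # extra slot is the terminal state
    history = []
    for _ in range(n_trials):
        deltas = []
        for s in range(n_states):
            # Prediction error: delta = r + gamma * V(next) - V(current)
            delta = rewards[s] + gamma * values[s + 1] - values[s]
            values[s] += alpha * delta  # move the estimate toward the target
            deltas.append(delta)
        history.append(deltas)
    return values[:n_states], history


if __name__ == "__main__":
    learned, history = td_errors([0.0, 0.0, 1.0])
    print("error at reward, first trial:", history[0][-1])
    print("error at reward, last trial:", history[-1][-1])
    print("learned values:", learned)
```

    In this toy setting the error at the time of reward is large on the first trial and shrinks toward zero as the reward becomes predicted, which is the signature that TD accounts compare with dopamine recordings.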

    Temporal-Difference Reinforcement Learning with Distributed Representations

    Temporal-difference (TD) algorithms have been proposed as models of reinforcement learning (RL). We examine two issues of distributed representation in these TD algorithms: distributed representations of belief and distributed discounting factors. Distributed representation of belief allows the believed state of the world to distribute across sets of equivalent states. Distributed exponential discounting factors produce hyperbolic discounting in the behavior of the agent itself. We examine these issues in the context of a TD RL model in which state-belief is distributed over a set of exponentially-discounting “micro-Agents”, each of which has a separate discounting factor (γ). Each µAgent maintains an independent hypothesis about the state of the world, and a separate value-estimate of taking actions within that hypothesized state. The overall agent thus instantiates a flexible representation of an evolving world-state. As with other TD models, the value-error (δ) signal within the model matches dopamine signals recorded from animals in standard conditioning reward-paradigms. The distributed representation of belief provides an explanation for the decrease in dopamine at the conditioned stimulus seen in overtrained animals, for the differences between trace and delay conditioning, and for transient bursts of dopamine seen at movement initiation. Because each µAgent also includes its own exponential discounting factor, the overall agent shows hyperbolic discounting, consistent with behavioral experiments
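
    The claim that a population of exponential discounters behaves hyperbolically can be checked numerically. The sketch below assumes the discounting factors are spread uniformly over (0, 1); the abstract does not state the distribution, so this is an illustrative assumption, as are the function names. Under that assumption the population average of γ^d equals the integral of γ^d over (0, 1), which is 1/(1 + d), a hyperbolic curve.

```python
import numpy as np


def mean_exponential_discount(delay, n_agents=10_000):
    """Average of gamma**delay over agents with gamma evenly spaced in (0, 1).

    Assumption: a uniform spread of discounting factors, sampled on a
    midpoint grid so the average approximates the integral over (0, 1).
    """
    gammas = (np.arange(n_agents) + 0.5) / n_agents
    return float(np.mean(gammas ** delay))


def hyperbolic_discount(delay, k=1.0):
    """Hyperbolic discount 1 / (1 + k * delay)."""
    return 1.0 / (1.0 + k * delay)


if __name__ == "__main__":
    for d in (0, 1, 5, 20):
        print(d, mean_exponential_discount(d), hyperbolic_discount(d))
```

    Each individual agent still discounts exponentially; only the average across agents follows the hyperbolic form, and a different spread of discounting factors would give a different aggregate curve.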

    Expert Financial Advice Neurobiologically “Offloads” Financial Decision-Making under Risk

    BACKGROUND: Financial advice from experts is commonly sought during times of uncertainty. While the field of neuroeconomics has made considerable progress in understanding the neurobiological basis of risky decision-making, the neural mechanisms through which external information, such as advice, is integrated during decision-making are poorly understood. In the current experiment, we investigated the neurobiological basis of the influence of expert advice on financial decisions under risk. METHODOLOGY/PRINCIPAL FINDINGS: While undergoing fMRI scanning, participants made a series of financial choices between a certain payment and a lottery. Choices were made in two conditions: 1) advice from a financial expert about which choice to make was displayed (MES condition); and 2) no advice was displayed (NOM condition). Behavioral results showed a significant effect of expert advice. Specifically, probability weighting functions changed in the direction of the expert's advice. This was paralleled by neural activation patterns. Brain activations showing significant correlations with valuation (parametric modulation by value of lottery/sure win) were obtained in the absence of the expert's advice (NOM) in intraparietal sulcus, posterior cingulate cortex, cuneus, precuneus, inferior frontal gyrus and middle temporal gyrus. Notably, no significant correlations with value were obtained in the presence of advice (MES). These findings were corroborated by region of interest analyses. Neural equivalents of probability weighting functions showed significant flattening in the MES compared to the NOM condition in regions associated with probability weighting, including anterior cingulate cortex, dorsolateral PFC, thalamus, medial occipital gyrus and anterior insula. Finally, during the MES condition, significant activations in temporoparietal junction and medial PFC were obtained. 
    CONCLUSIONS/SIGNIFICANCE: These results support the hypothesis that one effect of expert advice is to "offload" the calculation of value of decision options from the individual's brain

    Convergent Processing of Both Positive and Negative Motivational Signals by the VTA Dopamine Neuronal Populations

    Dopamine neurons in the ventral tegmental area (VTA) have been traditionally studied for their roles in reward-related motivation or drug addiction. Here we study how the VTA dopamine neuron population may process fearful and negative experiences as well as reward information in freely behaving mice. Using multi-tetrode recording, we find that up to 89% of the putative dopamine neurons in the VTA exhibit significant activation in response to the conditioned tone that predicts food reward, while the same dopamine neuron population also responds to fearful experiences such as free fall and shake events. The majority of these VTA putative dopamine neurons exhibit suppression and offset-rebound excitation, whereas ∼25% of the recorded putative dopamine neurons show excitation by the fearful events. Importantly, VTA putative dopamine neurons exhibit parametric encoding properties: their firing change durations are proportional to the fearful event durations. In addition, we demonstrate that contextual information is crucial for these neurons to elicit positive or negative motivational responses, respectively, to the same conditioned tone. Taken together, our findings suggest that VTA dopamine neurons may employ a convergent encoding strategy for processing both positive and negative experiences, intimately integrating with cues and environmental context

    Coordinated Activity of Ventral Tegmental Neurons Adapts to Appetitive and Aversive Learning

    Our understanding of how value-related information is encoded in the ventral tegmental area (VTA) is based mainly on the responses of individual putative dopamine neurons. In contrast to cortical areas, the nature of coordinated interactions between groups of VTA neurons during motivated behavior is largely unknown. These interactions can strongly affect information processing, highlighting the importance of investigating network level activity. We recorded the activity of multiple single units and local field potentials (LFP) in the VTA during a task in which rats learned to associate novel stimuli with different outcomes. We found that coordinated activity of VTA units with either putative dopamine or GABA waveforms was influenced differently by rewarding versus aversive outcomes. Specifically, after learning, stimuli paired with a rewarding outcome increased the correlation in activity levels between unit pairs whereas stimuli paired with an aversive outcome decreased the correlation. Paired single unit responses also became more redundant after learning. These response patterns flexibly tracked the reversal of contingencies, suggesting that learning is associated with changing correlations and enhanced functional connectivity between VTA neurons. Analysis of LFP recorded simultaneously with unit activity showed an increase in the power of theta oscillations when stimuli predicted reward but not an aversive outcome. With learning, a higher proportion of putative GABA units were phase locked to the theta oscillations than putative dopamine units. These patterns also adapted when task contingencies were changed. Taken together, these data demonstrate that VTA neurons organize flexibly as functional networks to support appetitive and aversive learning

    Effects of perceived cocaine availability on subjective and objective responses to the drug

    Rationale: Several lines of evidence suggest that cocaine expectancy and craving are two related phenomena. The present study assessed this potential link by contrasting reactions to varying degrees of the drug's perceived availability. Method: Non-treatment seeking individuals with cocaine dependence were administered an intravenous bolus of cocaine (0.2 mg/kg) under 100% ('unblinded'; N = 33) and 33% ('blinded'; N = 12) probability conditions for the delivery of drug. Subjective ratings of craving, high, rush and low along with heart rate and blood pressure measurements were collected at baseline and every minute for 20 minutes following the infusions. Results: Compared to the 'blinded' subjects, their 'unblinded' counterparts had similar craving scores on a multidimensional assessment several hours before the infusion, but reported higher craving levels on a more proximal evaluation, immediately prior to the receipt of cocaine. Furthermore, the 'unblinded' subjects displayed a more rapid onset of high and rush cocaine responses along with significantly higher cocaine-induced heart rate elevations. Conclusion: These results support the hypothesis that cocaine expectancy modulates subjective and objective responses to the drug. Given the important public health policy implications of heavy cocaine use, health policy makers and clinicians alike may favor cocaine craving assessments performed in settings with access to the drug rather than in more neutral environments as a more meaningful marker of disease staging and assignment to the proper level of care.